Latent Space Autoregression for Novelty Detection
Novelty detection is commonly referred to as the discrimination of observations that do not conform to a learned model of regularity. Despite its importance in different application settings, designing a novelty detector is profoundly difficult due to the unpredictable nature of novelties and their inaccessibility during the training procedure, factors which expose the unsupervised nature of the problem. In our proposal, we design a general framework where we equip a deep autoencoder with a parametric density estimator that learns the probability distribution underlying its latent representations through an autoregressive procedure.
We show that a maximum likelihood objective, optimized in conjunction with the reconstruction of normal samples, effectively acts as a regularizer for the task at hand, by minimizing the differential entropy of the distribution spanned by latent vectors. In addition to providing a very general formulation, extensive experiments of our model on publicly available datasets deliver on-par or superior performance compared to state-of-the-art methods in one-class and video anomaly detection settings. Differently from prior works, our proposal does not make any assumption about the nature of the novelties, making our work readily applicable to diverse contexts.
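The idea above, scoring a sample by combining its reconstruction error with the autoregressive likelihood of its latent code, can be sketched in a few lines. Everything here (the tied linear encoder/decoder, the hard-coded autoregressive Gaussian parameters, and the particular score combination) is an illustrative assumption, not the paper's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "encoder"/"decoder": a tied-weight linear projection to a 3-d latent space.
W_enc = rng.normal(size=(8, 3)) * 0.1
W_dec = W_enc.T

def encode(x):
    return x @ W_enc

def decode(z):
    return z @ W_dec

# Autoregressive Gaussian density over latents:
#   p(z) = prod_d p(z_d | z_{<d}),
# each conditional a Gaussian whose mean is a linear function of the
# preceding dimensions (parameters would be fitted on normal data;
# here we hard-code illustrative values).
ar_weights = [None, np.array([0.5]), np.array([0.2, -0.1])]
ar_sigma = np.array([1.0, 1.0, 1.0])

def latent_log_likelihood(z):
    ll = 0.0
    for d in range(z.shape[-1]):
        mu = z[..., :d] @ ar_weights[d] if d > 0 else 0.0
        ll += -0.5 * ((z[..., d] - mu) / ar_sigma[d]) ** 2 \
              - np.log(ar_sigma[d] * np.sqrt(2 * np.pi))
    return ll

def novelty_score(x):
    # High reconstruction error and low latent likelihood => more novel.
    z = encode(x)
    rec_err = np.sum((x - decode(z)) ** 2)
    return rec_err - latent_log_likelihood(z)
```

With this toy setup, inputs far from the region the autoencoder reconstructs well receive higher scores, which is the behaviour the framework relies on.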
Class-Incremental Continual Learning into the eXtended DER-verse
The staple of human intelligence is the capability of acquiring knowledge in
a continuous fashion. In stark contrast, Deep Networks forget catastrophically
and, for this reason, the sub-field of Class-Incremental Continual Learning
fosters methods that learn a sequence of tasks incrementally, blending
sequentially-gained knowledge into a comprehensive prediction.
This work aims at assessing and overcoming the pitfalls of our previous
proposal Dark Experience Replay (DER), a simple and effective approach that
combines rehearsal and Knowledge Distillation. Inspired by the way our minds
constantly rewrite past recollections and set expectations for the future, we
endow our model with the abilities to i) revise its replay memory to welcome
novel information regarding past data and ii) pave the way for learning yet
unseen classes.
We show that the application of these strategies leads to remarkable
improvements; indeed, the resulting method - termed eXtended-DER (X-DER) -
outperforms the state of the art on both standard benchmarks (such as CIFAR-100
and miniImageNet) and a novel one here introduced. To gain a better
understanding, we further provide extensive ablation studies that corroborate
and extend the findings of our previous research (e.g. the value of Knowledge
Distillation and flatter minima in continual learning setups).Comment: 23 pages, 22 figures. To appear in IEEE TPAM
Continual semi-supervised learning through contrastive interpolation consistency
Continual Learning (CL) investigates how to train Deep Networks on a stream of tasks without incurring forgetting. CL settings proposed in the literature assume that every incoming example is paired with ground-truth annotations. However, this clashes with many real-world applications: gathering labeled data, which is in itself tedious and expensive, becomes infeasible when data flow as a stream. This work explores Continual Semi-Supervised Learning (CSSL): here, only a small fraction of labeled input examples are shown to the learner. We assess how current CL methods (e.g., EWC, LwF, iCaRL, ER, GDumb, DER) perform in this novel and challenging scenario, where overfitting entangles forgetting. Subsequently, we design a novel CSSL method that exploits metric learning and consistency regularization to leverage unlabeled examples while learning. We show that our proposal exhibits higher resilience to diminishing supervision and, even more surprisingly, that relying on only a fraction of the supervision suffices to outperform SOTA methods trained under full supervision.
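The consistency-regularization ingredient mentioned above can be sketched with a simple penalty on the disagreement between predictions for two augmented views of the same unlabeled example. The function names and the squared-distance choice are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def consistency_loss(logits_view1, logits_view2):
    """Penalise disagreement between the model's predictions on two
    augmented views of the same unlabeled input: the squared distance
    between the two softmax outputs, averaged over the batch."""
    return float(np.mean((softmax(logits_view1) - softmax(logits_view2)) ** 2))
```

The loss is zero when the two views yield identical predictions and grows with disagreement, so minimizing it pushes the model toward augmentation-invariant outputs without needing labels.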
Input Perturbation Reduces Exposure Bias in Diffusion Models
Denoising Diffusion Probabilistic Models have shown an impressive generation
quality, although their long sampling chain leads to high computational costs.
In this paper, we observe that a long sampling chain also leads to an error
accumulation phenomenon, which is similar to the exposure bias problem in
autoregressive text generation. Specifically, we note that there is a
discrepancy between training and testing, since the former is conditioned on
the ground truth samples, while the latter is conditioned on the previously
generated results. To alleviate this problem, we propose a very simple but
effective training regularization, consisting in perturbing the ground truth
samples to simulate the inference time prediction errors. We empirically show
that, without affecting the recall and precision, the proposed input
perturbation leads to a significant improvement in the sample quality while
reducing both the training and the inference times. For instance, on CelebA
64×64, we achieve a new state-of-the-art FID score of 1.27, while saving
37.5% of the training time. The code is publicly available at
https://github.com/forever208/DDPM-IP
Comment: accepted by ICML 2023
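The training-time perturbation described above can be sketched as follows: extra Gaussian noise is mixed into the forward diffusion step so that the network sees inputs resembling its own imperfect inference-time predictions. The function name, the `gamma` scale, and the NumPy formulation are assumptions for illustration, not the released implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def perturbed_forward_sample(x0, t, alphas_cumprod, gamma=0.1):
    """One DDPM training sample with input perturbation (sketch).

    Standard forward diffusion draws x_t from
        x_t = sqrt(a_bar_t) * x0 + sqrt(1 - a_bar_t) * eps.
    Here the noise is perturbed with an extra Gaussian term scaled by
    `gamma` (an assumed hyper-parameter), simulating the prediction
    errors the sampler accumulates at inference time. The regression
    target remains the unperturbed `eps`.
    """
    eps = rng.normal(size=x0.shape)
    xi = rng.normal(size=x0.shape)          # the input perturbation
    a_bar = alphas_cumprod[t]
    xt = np.sqrt(a_bar) * x0 + np.sqrt(1.0 - a_bar) * (eps + gamma * xi)
    return xt, eps  # the denoiser is trained to predict `eps` from (xt, t)
```

Setting `gamma = 0` recovers the standard DDPM training sample, which is why the method is described as a simple training regularization rather than a new model.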